Evaluating Translational Correspondence using Annotation Projection
Authors
Abstract
Recently, statistical machine translation models have begun to take advantage of higher-level linguistic structures such as syntactic dependencies. Underlying these models is an assumption about the directness of translational correspondence between sentences in the two languages; however, the extent to which this assumption is valid and useful is not well understood. In this paper, we present an empirical study that quantifies the degree to which syntactic dependencies are preserved when parses are projected directly from English to Chinese. Our results show that although the direct correspondence assumption is often too restrictive, a small set of principled, elementary linguistic transformations can boost the quality of the projected Chinese parses by 76% relative to the unimproved baseline.
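To make the idea of direct projection concrete, the following is a minimal sketch, not the paper's actual algorithm. It assumes a one-to-one word alignment given as a dictionary from source to target word indices, and it simply copies a dependency whenever both of its endpoints are aligned. The names `project_dependencies` and `unlabeled_precision_recall`, and the toy data in the usage lines, are hypothetical and chosen only for illustration.

```python
from typing import Dict, Set, Tuple

# A dependency is a (head index, dependent index) pair over word positions.
Dependency = Tuple[int, int]


def project_dependencies(
    source_deps: Set[Dependency],
    alignment: Dict[int, int],
) -> Set[Dependency]:
    """Copy source dependencies onto target words through a word alignment.

    Assumption (illustrative): the alignment is one-to-one and maps a source
    word index to a target word index. Unaligned words are simply skipped.
    """
    projected: Set[Dependency] = set()
    for head, dependent in source_deps:
        # A dependency survives only when both endpoints have a target word.
        if head in alignment and dependent in alignment:
            target_head = alignment[head]
            target_dependent = alignment[dependent]
            # Skip degenerate self-loops.
            if target_head != target_dependent:
                projected.add((target_head, target_dependent))
    return projected


def unlabeled_precision_recall(
    projected: Set[Dependency],
    gold: Set[Dependency],
) -> Tuple[float, float]:
    """Compare projected dependencies against a gold-standard set."""
    correct = len(projected & gold)
    precision = correct / len(projected) if projected else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Invented toy example: a four-word source sentence whose third word
# (index 2) has no counterpart in a three-word target sentence.
source_deps = {(1, 0), (1, 3), (3, 2)}
alignment = {0: 0, 1: 1, 3: 2}
projected = project_dependencies(source_deps, alignment)
```

In this toy case the dependency `(3, 2)` is dropped because source word 2 is unaligned, which is one simple way divergences between languages can reduce the quality of projected parses.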
Related papers
The Projector: An Interactive Annotation Projection Visualization Tool
Previous works proposed annotation projection in parallel corpora to inexpensively generate treebanks or propbanks for new languages. In this approach, linguistic annotation is automatically transferred from a resource-rich source language (SL) to translations in a target language (TL). However, annotation projection may be adversely affected by translational divergences between specific langua...
Translational Equivalence and Cross-lingual Parallelism: The Case of FrameNet Frames
Annotation projection is a strategy for the cross-lingual transfer of annotations which can be used to bootstrap linguistic resources for low-density languages, such as role-semantic databases similar to FrameNet. In this paper, we investigate the main assumption underlying annotation projection, cross-lingual parallelism, which states that annotation is parallel across languages. Concentrating...
Cross-lingual Abstract Meaning Representation Parsing
Abstract Meaning Representation (AMR) annotation efforts have mostly focused on English. In order to train parsers on other languages, we propose a method based on annotation projection, which involves exploiting annotations in a source language and a parallel corpus of the source language and a target language. Using English as the source language, we show promising results for Italian, Spanis...
Evaluating Three Image Segmentation Algorithms from Two Perspectives: Segmentation Error Measures and Image Annotation
Image segmentation has an essential role in the image annotation process which assigns meaningful words to an image taking into account its content. For this reason it is important to identify which segmentation algorithm is producing better results. This evaluation can be made using segmentation error measures for consistency quantification and by analyzing the results of the annotation proces...
Analysis of Translational Correspondence in view of Sub-sentential Alignment
This paper reports on the first results of an empirical study of translational correspondence in different text types for the English-Dutch language pair. A Gold Standard was created, which can be used as a standard data set for evaluating subsentential alignment. The manually indicated translational correspondences were analyzed in view of different heuristics used in existing sub-sentential a...
Publication date: 2002